106A22[1,RWF] - www.SailDart.org

perm filename 106A22[1,RWF] blob sn#728165 filedate 1983-09-22 generic text, type C, neo UTF8

COMMENT ⊗ VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 Algorithms
C00014 00003
C00015 ENDMK
C⊗;
Algorithms

by
Robert W. Floyd
Copyright l983

An algorithm is the idea of a particular method of computation, like the
idea of addition and subtraction using carries and borrows most of us
learned in school. As with other ideas, an algorithm expressed in one
language can be translated into another; the choice of language is not an
essential part of the algorithm. Familiar algorithms include those for
addition, subtraction, multiplication, and division; those for solving
simultaneous equations by successive elimination of variables; that for
differentiating a formula with respect to a variable; those for estimating
the area under a curve by approximation with line segments, etc. One of
the oldest, due to Euclid, finds the greatest common divisor of two
positive numbers:

(1) Let x be the larger number, y the smaller.
(2) If y=0, x is the answer.
(3) Otherwise, let the new value of y be the remainder when x is divided
by y; let the new value of x be the old value of y. Return to Step
(2).

Example: the greatest common divisor of 195 and 75. Successive values of
x and y are:

x = 195, y = 75
x = 75, y = 45
x = 45, y = 30
x = 30, y = 15
x = 15, y = 0

The answer is 15

An algorithm can be expressed in a language intelligible to man, machine,
or both. Euclid's algorithm, expressed in a language more suitable to be
carried out by a machine, looks like:

GET X
GET Y
IF X < Y THEN SWAP X WITH Y
LABEL A
IF Y=0 THEN RETURN RESULT = X
SET R = REMAINDER OF X DIVIDED BY Y
SET X = Y
SET Y = R
GO TO STEP A.

Computers are machines to carry out algorithms; to be useful, they must be
able to execute long, complicated algorithms on numerous data quickly and
reliably. A computer carries out an algorithm as expressed in one of
several precisely defined languages for that computer. An algorithm
expressed in a language executable by some actual computer is called a
program.

Programming, the design of programs, consists of two parts, often
interwoven: the design of an algorithm to solve a problem, and the
expression of that algorithm as a program within the limited notations of
a particular computer language. To learn to program, you must learn and
use some particular programming language, just as music is learned on some
particular instrument. The core of learning to program, though, is
learning to design algorithms.

In CS106, we use the Pascal programming language. It is popular with users
of small to medium-sized computers (``micros'' and ``minis''), and has
become a common language for communication of algorithms in print.
Pascal is not as well suited for the expression of business problems as
PL-I and COBOL; nor as well suited for engineering calculations as
FORTRAN; nor as well suited for processing symbolic information as SNOBOL
and LISP; nor ... well, you get the idea. No matter. Most of what you
will want to program can be said very similarly in most programming
languages, and after you learn one such language, you can learn any other
in a day or so.

So, you will learn Pascal, and, with labor and attention, how to design an
algorithm systematically and correctly. You won't learn all of Pascal from
the course. This is no oversight; parts of Pascal are largely of use to
more experienced programmers, and parts are of marginal usefulness. If
you have trouble with the problems, it won't be because you don't know
Pascal well enough.

My goals are to teach you to systematically and correctly design computer
programs more complicated than anything else you have ever designed in
your life, programs so sensitive to error that a single mis-typed symbol
will probably make the program incorrect. Ideally, you will learn to
program in such a way that your first drafts of programs will contain few
errors other than slips of the pen, and that your programs can be
systematically tested and corrected (``debugged'').

Programming without standards of quality is easy. Programming is a
difficult discipline if one believes a program must be utterly trustworthy
on all valid data; that it must detect and report all invalid data; that
its results must be intelligible and unambiguous; that other programmers
must be able to adapt the program to other languages, computers, and
problems than those for which it was originally designed, long after the
original programmer has vanished.

The course notes are interlaced with Rules of Good Programming Practice.
These are only a small subset of the 927 (or was it 928?) eternal truths,
but they are very useful, and we expect you either to adhere to them or
(since they have exceptions) explain why.

We also expect you to take responsibility for yourself. The computer
center can be a difficult environment. The computer may fail for a day at
a time. Lines to use the computer may be hours long at the end of the
quarter. Assignments may be more difficult than intended. We expect you
to begin projects as early as possible; if the computer fails six hours
before a program is due, that is your problem. We expect you to use the
often overloaded computer system in a way considerate to your fellow
students; in particular, when you no longer are sure what you are doing,
get off the computer and let someone else use it. Also, delete any files
you no longer need to release storage capacity for other users. We expect
your programs to work for all valid data, and not just for the test cases
you run; to deliberately design a program that only works for the data on
which it is tested will be considered cheating. We expect you to plan to
spend twelve hours a week on this course (the university guideline for a
4-unit course) including class and computer time; you will find the course
insuitable as part of an 18-unit program.

We expect you, as the assignments become more difficult, to include
adequate explanations of how your program works. Let the hard-pressed
grader, who perhaps just took the course the previous quarter, know in
outline what your methods are. And if your native language is English, we
would appreciate a demonstration of the fact. We expect you to respect the
privacy of other people's computer files; the fact that someone has not
fully protected his files does not give you the right to read them.

Okay, we've told you the worst. On the good side, the teaching assistants,
consultants, and professor are there to help out. Not by doing your work
for you, but by serving as models of how to think about programming. Most
of the time, the computer center is a friendly environment. And the
computer itself is the most versatile tool you will ever use. You have
signed up to vastly extend your powers to use and create information. I
think you will be glad you did.

ALGRMS[1, rfn]

**American Programming's Plight**

prof. dr. Edsger W. Dijkstra
Burroughs Research Fellow

A competent programmer's most important assets are--perhaps in this order--
an excellent mastery of his native tongue and a considerable methematical
maturity. These facts are well-known and have been well-understood for more
than ten years.
By a sad accident of history they make programming, which is difficult anyhow,
in the USE even more
COMMENT ⊗ VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 Reliable Algorithms
C00011 00003 An algorithm need not work with numbers. Here is an algorithm to find a word in
C00016 ENDMK
C⊗;
Reliable Algorithms

Robert W. Floyd
Copyright 1983

There is a traditional algorithm used by Russian peasants to multiply numbers
of several digits, based on doubling, halving, and addition. To see why it
works, let's first look at a particular multiplication problem.

How much is 38 x 45? Since 38 = 19 x 2, 38 x 45 = 19 x 2 x 45 = 19 x 90.
That's 90 more than 18 x 90, so

19 x 90 = 18 x 90 + 90 = 9 x 2 x 90 + 90 = 9 x 180 + 90

Since 9 x 180 is 180 more than 8 x 180,

9 x 180 + 90 = 8 x 180 + 180 + 90 = 8 x 180 + 270

= 4 x 2 x 180 + 270 = 4 x 360 + 270

4 x 360 + 270 = 2 x 2 x 360 + 270 = 2 x 720 + 270 = 1440 + 270 = 1710.

Now that the method is clear, the process can be shortened to filling out a table.
In each row, we have two numbers we want to multiply, and another number to be
added on to the result. For the calculation above, here is the table:

A B C

38 45 0
19 90 0
18 90 90
9 180 90
8 180 270
4 360 270
2 720 270
1 1440 270
0 1440 1710

Here is an explicit algorithm for making the table.

(1) The first row across contains, in columns A and B, the numbers we want to
multiply, 38 and 45 in this example, and in column C a zero.

(2) If A=0 in the bottom row of the table, we are finished; the entry in column C
is the answer. Otherwise we need to make another row.

(3) If A is even in the bottom row, we make the next row by halving the number
in column A, doubling the number in column B, and copying the previous number
from column C.

(4) If A is odd in the bottom row, we make the next row by subtracting one from
the number is column A, copying the member in column B, and adding the number
from column B to the number in column C.

Every row in the table stands for a formula; the fifth row, above, stands for
8 x 180 + 270. People who work with algorithms find ways to represent entities
that are not just numbers by sequences of numbers; this allows using computers
that only handle numbers to solve problems that involve not only numbers, but
pictures, words, formulas, logic, and myriad others.

The successive rows represent different formulas, but each formula stands for
the same number as the one before it, so they all stand for the same number.
We get from the problem to the solution by picking a first line that is easy
to construct from the problem, and getting to a last line from which it is easy
to construct the solution. In between, we need steps that are easy to carry
out, that progress toward an acceptable last line, and that change the question
without changing the answer. That is, we go from ``What is 8 x 180 + 270?'' to
``What is 4 x 360 + 270?'' without changing the answer, 1710.

A good algorithm is like good government; it involves both stability and progress.
Progress in solving a problem comes from changing it into a simpler problem;
stability comes from assuring that each new problem has the same answer as the one
it grew from. In the Russian peasant multiplication algorithm, progress is
quaranteed because columnn A gets closer to zero at every step; it can't go on
forever, since there are only a limited number of possible values A can have,
once we know the first value. Stability is guaranteed because in each row, the
formula AxB+C has the same value.

A similar algorithm, known to Euclid around 430 B.C., finds the largest number
that evenly divides two given numbers. If we want to reduce a fraction like
385/315 to its simplest form, we find 35 as the greatest common divisor (g.c.d.)
of 385 and 315, and say 385/315 = (385/35)/(315/35) = 11/9. We can get greatest
common divisors by trial and error, but there is a much more efficient way.
Here is a table given by Euclid's algorithm finding the g.c.d. of 385 and 3l5:

A B

315 385
315 70
70 315
70 245
70 175
70 35
35 70
35 35

The algorithm sets up the first row with the smaller number in column A, and
the larger in column B. As long as B is bigger than A, it makes the next row
by copying A, and reducing B by A (subtraction). If B gets smaller than A, it
makes the next row by exchanging A with B. If B is equal to A in a row, that
number is the greatest common divisor.

The principle of stability is that the common divisors of A and B are the same
in each row. In this example, in every row both A and B are multiples of
1,5,7, and 35. If we decrease B by A, the result is still a multiple of
1,5,7, and 35, but no new common divisors are introduced (a little easy algebra
will convince you).

The principle of progress can be formulated in several ways. One way is that
the value of 2A+B decreases at every step, while staying positive.

A more efficient formulation of Euclid's algorithm uses remainders (of division)
rather than subtraction. Here it finds the g.c.d. of 315 and 385 again:

A B

315 385
70 315
35 70
0 35

As before, in the first line A is the smaller datum, B the larger. When we get
to a line with A=0, we stop, and B is the answer. Otherwie, we find the
remainder when B is divided by A. In the next line, A is that remainder, while
B is copied from A in this line.

In the improved version of the algorithm, the principle of stability is the same;
all the rows have the same common divisors and therefore the same g.c.d. The
principle of progress is that A decreases at every step, without becoming negative.
An algorithm need not work with numbers. Here is an algorithm to find a word in
the dictionary.

Insert your left index finger between pages at the beginning of the dictionary,
and your right index finger at the end. Open the dictionary somewhere between
your fingers (If you can't, you've already found the right page). Look at the
word in the top left corner of the newly opened page. If it is alphabetically
earlier than the word you are looking for, move your left index finger into the
new opening; otherwise, move your right index finger there. Here is a record of
my looking up ``scutage'' in the Shorter Oxford English Dictionary.

Left finger Right finger New opening

page number word page number word page number word

1 A 2475 Zygin 1278 Monthly
1278 Monthly 2475 Zygin 1822 Sea-horse
1278 Monthly 1822 Sea-horse 1542 Pomatum
1542 Pomatum 1822 Sea-horse 1682 Redowa
1682 Redowa l822 Sea-horse 1730 Revolutionary
1730 Revolutionary 1822 Sea-horse 1780 Sailyard
1780 Sailyard 1822 Sea-horse 1800 Scantity
1800 Scantity 1822 Sea-horse 1810 Scorer
1810 Scorer 1822 Sea-horse 1816 Scriptural
1816 Scriptural 1822 Sea-horse 1820 Sea
1816 Sciptural 1822 Sea-horse 1818 Sculptile
1818 Sculptile 1820 Sea (none)

The principle of stability is that the word I'm looking for stays between my
fingers. The principle of progress is that the number of pages between my
fingers gets smaller.

Many algorithms embodied in computer programs are not reliable; when used on
certain data they give wrong answers, or they continue calculating until
someone intervenes. The way to make an algorithm reliable is to design it
around a principle of stability and a principle of progress. If you can't
formulate a principle of stability for an algorithm, it is likely to change
the problem it is solving as it goes along, and end up computing the right
answer to the wrong problem. If you can't formulate a principle of progress,
the algorithm is likely to run forever on some data. As the Russian peasant
multiplication and greatest common divisor algorithms show, it becomes much
easier to understand a new algorithm when its principle of stability (technically
known as an invariant) is presented with it.